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Abstract. Floating point operations are fast, but require continuous ef- 
fort on the part of the user in order to ensure that the results are correct. 
This burden can be shifted away from the user by providing a library 
of exact analysis in which the computer handles the error estimates. We 
provide an implementation of the exact real numbers in the COQ proof 
assistant. This improves on the earlier COQ- implementation by O'Connor 
in two ways: we use dyadic rationals built from the machine integers and 
we optimize computation of power series by using approximate division. 
Moreover, we use type classes for clean mathematical interfaces. This 
appears to be the first time that type classes are used in heavy compu- 
tation. We obtain over a 100 times speed up of the basic operations and 
indications for improving the COQ system. 



1 Introduction 

Real numbers cannot be represented exactly in a computer. Hence, in construc- 
tive analysis [I] one approximates real numbers by rational, or dyadic numbers. 
The real numbers are the completion of the rationals. This completion con- 
struction can be organized in a monad, a familiar construct from functional 
programming (Section [21). The completion monad provides an efficient combina- 
tion of proving and computing |2j. In this way, O'Connor [3J implements exact 
real numbers and the transcendental functions on them in COQ. 

A number of possible improvements in this implementation were already 
suggested in First, we can use COQ's new machine integers; see Section [31 
Second, we can use dyadic rationals (that arc numbers of the shape n * 2*^ for 
n, e € Z, also known as infinitary floats). Third, the implementation of power 
series can be improved by using approximate division. Here we carry out all 
three optimizations. Unfortunately, changing O'Connor's implementation to use 
the new machine integers was far from trivial, as he used a particular concrete 
representation of the rationals. To avoid this in the future, we provide an abstract 
specification of the dense set as approximate rationals; see Sectional 

In Section[31we provide some abstract order theory culminating in the theory 
of approximate rationals. Section [51 deals with computing power series using 
dyadics. Section [51 describes Wolfram's algorithm to compute the square root of 
a real number. We finish with some benchmarks in Section [3 
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2 The CoQ-system 

The COQ proof assistant is based on the calculus of inductive constructions [516] , 
a dependent type theory with (co)inductive types; see [718] . In true Curry- 
Howard fashion, it is both a pure functional programming language with an 
expressive type system, and a language for mathematical statements and proofs. 
We highlight some aspects of COQ relevant for our development. 

Types and propositions. Propositions in COQ arc types [QllOj . which themselves 
have types called sorts. COQ features a distinguished sort called Prop that one 
may choose to use as the sort for types representing propositions. The distin- 
guishing feature of the Prop sort is that terms of non-Prop type may not depend 
on the values of inhabitants of Prop types (that is, proof terms). This regime of 
discrimination establishes a weak form of proof irrelevance, in that changing a 
proof can never affect the result of value computations. On a practical level, this 
lets COQ safely erase all Prop components when extracting certified programs to 
OCAML or Haskell. We should note however, that in practice, COQ's extrac- 
tion mechanism is still very hard to use for programs with the complexity, 
in terms of depth of definitions, that we are interested in |12ll3j . 

Equality, setoids, and rewriting Because the COQ type theory lacks quotient 
types (as it would make type checking undecidable) , one usually bases abstract 
structures on a setoid ('Bishop set'): a type equipped with an equivalence rela- 
tion |lll4j . This leads to a naive set theory as described by Palmgren [15] . When 
the user attempts to substitute a given (sub)term using an equality, the system 
keeps track of, resolves, and combines proofs of equivalence [16] . 

The 'native' notion of equality in COQ, Leibniz equality, is that of terms being 
convertible, naturally reified as a proposition by the inductive type family eq with 
single constructor eq_refl : V (T : Type)(x : T), x = x, where a = b is notation for 
eq T a b. Since convertibility is a congruence, a proof of a = b lets us substitute 
b for a anywhere inside a term without further conditions. Our interest is in 
more complicated equalities, so we diverge from COQ tradition and reserve — for 
setoid equality. Rewriting with — does give rise to side conditions. For instance, 
consider formal fractions of integers as a representation of rationals. Rewriting a 
subterm using such an equality is permitted only if the subterm is an argument of 
a function that has been proven to respect the equality. Such a function is called 
Proper, and that property must be proved for each function in whose arguments 
we wish to enable rewriting. 

Type classes. Type classes have been a great success story in the Haskell 
functional programming language, as a means of organizing interfaces of abstract 
structures. COQ's type classes provide a superset of their functionality, but are 
implemented in a different way. 

In Haskell and Isabelle, type classes and their instances are second class. 
They are handled as specialized syntactic constructs whose semantics are given 
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specifically by the type class apparatus. By contrast, the expressivity of depen- 
dent types and inductive families as supported in COQ, combined with the use 
of pre-existing technology in the system (namely proof search and implicit argu- 
ments) enable a first class type class implementation jl7j : classes arc ordinary 
record types ('dictionaries'), instances are ordinary constants of these record 
types (registered as hints with the proof search machinery), class constraints are 
ordinary implicit parameters, and instance resolution is achieved by augmenting 
the unification algorithm to invoke ordinary proof search for implicit arguments 
of class type. Thus, type classes in COQ are realized by relatively minor syntactic 
aids that bring together existing facilities of the theory and the system into a 
coherent idiom, rather than by introduction of a new category of qualitatively 
different definitions with their own dedicated semantics. 

We use the algebraic hierarchy based on type classes and its abstract spec- 
ification of N, K and Q described in [18]. Unfortunately, we should note that 
we have clearly met the efficiency problems connected to the current implemen- 
tation of type classes in COQ. Luckily, these efficiency problems are limited to 
instance resolution which is only performed at compile time. Type classes have 
only a very minor effect on the computation time of type checked terms due to 
the absence of code inlining; see Section [7] for timings. 

Virtual machine and machine integers. COQ includes a virtual machine |19] . 
vm_compute, based on Ocaml's virtual machine to allow efficient evaluation. 
Both the abstract machine and its compilation scheme have been proved correct, 
in CoQ, with respect to the weak reduction semantics. However, we still need to 
extend our trusted core to a bigger kernel, as the implementation has not been 
formally verified. 

Machine integers were also added to the COQ system ^U\. The usual evalu- 
ation inside COQ (compute) uses a special inductive type for cyclic integers, but 
the virtual machine uses Ocaml's machine integers. This allows for a big speed- 
up, for which we pay by having to trust (the virtual machine and) that OCAML 
treats these integers correctly. The time difference between computation with 
Coq's int and Ocaml's BigJnt is about a factor of 20 [H] on primality tests. 

3 Metric spaces 

Having completed our brief description of the COQ-system, we now turn to 
O'Connor's formalization of exact real numbers [5]. Traditionally, a metric space 
is defined as a set X with a metric function d : X x X R+ satisfying certain 
axioms. We use a more relaxed definition of a metric space that does not require 
the metric be a function; see also [52] . The metric is represented via a (respectful) 
ball relation B : Q+ X ^ X ^ Prop satisfying: 

msp_refl : Vxe, x x 

msp_sym : Vx y e, x y B^ y x 

msp_triangle : Wx y z ei e2, B^^ x y B^, y z ^ B^j^+j.^ x z 
msp_closed : Vxye, (V(5, 'B^_^g x y) — >• B^ x i/ 
msp_eq : Vxy, (Ve, 'B^x y) ^ x = y 
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The ball relation x y expresses that the points x and y are within e of each 
other. We call this a ball relationship because the partially applied relation 

X : X ^ Prop is a predicate that represents the closed ball of radius e 
around the point x. For example, the ball relation on Q is B^ x y :=\x — y\ < e. 

We will introduce the completion of a metric space as a monad. In order to 
do this we will first introduce monads. 

Monads. Moggi [23| recognized that many non-standard forms of computation 
may be modeled by monad^. Wadler [St] popularized their use in functional 
programming. Monads are now an established tool to structure computation 
with side-effects. For instance, programs with input X and output Y which have 
access to a mutable state S can be modeled as functions of type X x S ^ Y x S ^ 
or equivalently X ^ (Y x S)^ . The type constructor DJIY ;= {Y x S)^ is 
an example of a monad. Similarly, partial functions may be modeled by maps 
X Y±, where Y± := Y + () is a monad. 

The formal definition of a (strong) monad is a triple return, bind) con- 
sisting of a type constructor DJt and two functions: 

return : X MX 
bind : {X MY) MX WIY 

We will denote return x as x, and bind / as /. These two operations must satisfy: 

bind return a = a 
fa^ fa 
fig a) = bind (f o g) a 
Completion monad. The completion of a metric space X is defined by: 
€X:={f:Q+^X\ Vei 62, B,^+,^ (/ ei) (/ £2)}. 

Given metric spaces X and Y, a function f : X ^ Y is uniformly continuous 
with modulus ^/ : Q+ Q+ if: 

yexiX2, B^^^xi X2 -)-'B^{f xi) if X2). 

Completion is a monad on the category of metric spaces with uniformly continu- 
ous functions. The function return : X CX defined by Ax e, x is the embedding 
of a metric space in its completion. Moreover, a uniformly continuous function 
/ : X — > €Y with modulus fif can be lifted to operate on complete metric 
spaces as bind/ : €X €Y defined by Axe, f {x{fif^)) |. In fact, the text 
above contains a white lie: we need a minor restriction to prclcngth spaces [3]. 

One advantage of this approach is that it helps us to work with simple rep- 
resentations. Let R := Then to specify a function from R -> R, we define 
a uniformly continuous function / : (Q — > R, and obtain / : R — > R as the 
required function. Hence, the completion monad allows us to do in a structured 
way what was already folklore in constructive mathematics: to work with simple, 
often decidable, approximations to continuous objects. 

^ In category theory one would speak about the Kleisli category of a (strong) monad. 
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4 Abstract interfaces using type classes 

An important part of this work is to further develop the algebraic hierarchy 
based on type classes by Spitters and van der Weegen [TH]. Especially, we have 
formalized some order theory and developed interfaces for mathematical oper- 
ations common in programming languages such as shift and power. This layer 
of abstraction makes both proof engineering and programming more flexible: it 
avoids duplication of code, it introduces a canonical way to refer to operations 
and properties, both by names and notations, and it allows us to easily swap 
different implementations of number representations and their operations. First 
we will briefly recap the design decisions made in [15] . 

Algebraic structures are expressed in terms of a number of carrier sets, a 
number of relations and operations, and a number of laws that the operations 
satisfy. One way of describing such a structure is by a bundled representation: 
one uses a dependently typed record that contains the carrier, operations and 
laws. For example a semigroup can be represented as follows. (The fields sg_car 
and sg_proper support our explicit handling of naive set theory in type theory.) 

Record SemiGroup : Type := { 
sg_car :> Setoid ; 
sg_op : sg_car — > sg_car — > sg_car ; 
sg.proper : Proper ((=) (=) (=)) sg_op ; 
sg_ass ; V X y z, sg_op x (sg_op y z) = sg_op (sg_op x y) z) } 

However, this approach has some serious limitations, the most important one 
being a lack of support for sharing components. For example, suppose we group 
together two CommutativeMonoids in order to create a SemiRing. Now awkward 
hacks are necessary to establish equality between the carriers. A second problem 
is that if we stack up these records to represent higher structures the projection 
paths become increasingly long. 

Historically these problems have been an acceptable trade-off because un- 
bundled representations, in which the carrier and operations are parameterized, 
introduce even more problems. 

Record SemiGroup {A} (e : A — >• A — > Prop) (sg_op : A ^ A ^ A) : Prop := { 
sg_proper : Proper (e =^ e e) sg_op ; 
sg_ass : V X y z, e (sg_op x (sg_op y z)) (sg_op (sg_op x y) z) } 

There is nothing to bind notation to, no structure inference, and declaring and 
passing requires too much manual bookkeeping. Spitters and van der Weegen 
have proposed a use of COQ's new type class machinery that resolves many of 
the problems of unbundled representations. Our current experiment confirms 
that this is a viable approach. 

An alternative solution is provided by packed classes |25j which use an al- 
ternative, and older, implementation of a semblance of type classes: canonical 
structures. Yet another approach would use modules. However, as these are not 
fist class, we would be unable to define, e.g. homomorphisms between algebraic 
structures. 

An operational type class is defined for each operation and relation. 
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Class Equiv A := equiv: relation A. 

Infix "=" := equiv: type_scope. 

Class RingPlus A := ring_plus: A ^ A ^ A. 

Infix "+" := ring_plus. 

Now an algebraic structure is just a type class living in Prop that is parametrized 
by its carrier, relations and operations. This class contains all laws that the op- 
erations should satisfy. Since the operations are unbundled we can easily support 
sharing. For example let us consider the SemiRing interface. 

Class SemiRing A {e : Equiv A} {plus: RingPlus A} 

{mult: RingMult A} {zero: RingZero A} {one: RingOne A} : Prop := { 
semiring_mult_monoid :> OCommutativeMonoid A e mult one ; 
semiring_plus_monoid :> SCommutativeMonoid A e plus zero ; 
semiring_distr :> Distribute (.*.) (+) ; 
semiring_left_absorb :> LeftAbsorb (.*.) }. 

Without type classes it would be a burden to manually carry around the carrier, 
relations and operations. However, because these parameters are just type class 
instances, the type class machinery will perform that job for us. For example. 

Lemma example '{SemiRing R}x:l*x = x + 0. 

The backtick instructs COQ to automatically insert implicit declarations, namely 
e plus mult zero one. It further lets us omit a name for the SemiRing R parameter 
itself as well. All of these parameters will be given automatically generated names 
that we will never refer to. Furthermore, instance resolution will automatically 
find instances of the operational type classes for the written notations. Thus the 
above is really: 

Lemma example {R e plus mult zero one} {P : ©SemiRing R e plus mult zero one} x : 
Oequiv R e 

(@ring_mult R mult (@ring_one R one) x) 
(@ring_plus R plus x (@ring_zero R zero)). 

The syntax :> in the definition of SemiRing declares certain fields as substruc- 
tures. That means, a SemiRing can be seen as a CommutativeMonoid and each time 
a CommutativeMonoid instance is needed, a SemiRing can be used instead. This 
syntax should not be confused with the similar syntax for coercions in records 
(e.g. in the bundled representation of a SemiGroup on p.[5|). 

This approach to interfaces proved useful to formalize a standard algebraic 
hierarchy. Combined with category theory and universal algebra, N and TL are 
represented as interfaces specifying an initial SemiRing and initial Ring [18] . These 
abstract interfaces for the naturals and integers make it easier to change the 
concrete representation in the future. No such simple specification for (Q seems 
to exists, so we choose to specify it as the field of fractions of TL. More precisely, 
Q is specified as a Field containing TL that moreover can be embedded into the 
field of fractions of TL. 

Inductive Frac R '{e : Equiv R} '{zero : RingZero R} : Type := 

frac { num : R ; den : R ; den_nonzero : den 7^ }. 
Class RationalsToFrac (A : Type) := rationals_to_frac : V B '{Integers B}, A — >■ Frac B. 
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Class Rationals A {e plus mult zero one opp inv} '{U : IRationalsToFrac A} : Prop :— { 
rationalsjield :> ©Field A e plus mult zero one opp inv ; 
rationals_frac :> V '{Integers Z}, Injective (rationals_to_frac A Z) ; 
rationals_frac_mor :> V '{Integers Z}, SemiRing_Morphism (rationals_to_frac A Z) ; 
rationals_embedJnts :> V '{Integers Z}, Injective (integers_to_ring Z A) }. 

4.1 Order theory 

To abstract from N, (Q and R and their various implementations, we provide 
a basic library for ordered algebraic structures. For example, 

Class RingOrder '{Equiv A} '{RingPlus A} '{RingMult A} '{RingZero A} 
(o : Order A) := { 
ringorder_partialorder :> PartialOrder (<) ; 
ringorder_plus :> '(OrderPreserving (z +)); 
ringorder_mult : '(0 < x — > V y, < y — > < x * y) }. 

To apply this to N, which is merely a semiring, we introduce the, apparently 
new, notion of a SemiRingOrder. Every RingOrder is a SemiRingOrder. 

Class SemiRingOrder '{Equiv A} '{RingPlus A} '{RingMult A} '{RingZero A} 
(o : Order A) := { 
srorder_partialorder :> PartialOrder (<) ; 
srorder_plus :'(x<yo3z, 0<zAy = x + z); 
srorder_mult : '(0 < x ^ V y, < y ^ < x * y) }. 

This allows us to refer by canonical names to lemmas as those shown below for 
N, Z, (Q and the dyadics. 

Lemma plus_compat xi yi X2 y2 : xi < yi — 5> X2 < y2 ^ xi + X2 < yi + y2 ■ 
Lemma sprecedes_l_2 : 1 < 2. 

For instances of N, Q it is easy to define an order satisfying these interfaces: 

Instance nat_precedes '{Naturals N} : Order N j 10 A x y, 3 z, y = x + z. 

However, often we encounter an a priori different order on a structure, most likely 
an order defined in COQ's standard library (like NIe on N). Therefore we prove 
that an arbitrary order satisfying these interfaces while also being a TotalOrder 
uniquely specifics the order on N, Z and Q. For example: 

Context '{Naturals N} '{Naturals N2} {f : N ^ N2} '{!SemiRing_Morphism f} 

{ol : Order N} '{ISemiRingOrder ol} '{ITotalOrder ol} 

{o2 ; Order N2} '{ISemiRingOrder o2} '{ITotalOrder o2}. 
Global Instance: OrderEmbedding f. 

Unfortunately COQ has no support to have an argument be 'inferred if possible, 
generalized otherwise'; see (TB]. When declaring a parameter of RingOrder, one is 
often in a context where most of its components are already available. Usually, 
only the parameter Order has to be introduced. The current workaround in these 
cases involves providing names for components that are then never referred to, 
which is a bit awkward. In the above it would much nicer to write: 

Context '{Naturals N} '{Naturals N2} {f : N ^ N2} '{!SemiRing_Morphism f} 

'{ISemiRingOrder N} '{ITotalOrder N} '{ISemiRingOrder N2} '{ITotalOrder N2}. 
Global Instance: OrderEmbedding f. 
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4.2 Basic operations 

The operation nat_pow is most commonly, but inefficiently, defined as repeated 
multiplication and the operation shifti is defined as repeated multiplication by 
2. Instead we specify the desired behavior of these operations. This approach 
allows for different implementations for different number representations and 
avoids definitions and proofs becoming implementation dependent. 

We introduce interfaces that specify the behavior of the operations abs, shifti, 
nat_pow and int_pow. Again there arc various ways of specifying these interfaces: 
with iT-types, bundled or unbundled. In general, Z'-types are convenient for 
functions whose specification is easy, for example; 

Class Abs A '{Equiv A} '{Order A} '{RingZero A} '{Grouplnv A} 

:= abs_sig: V (x : A), { y : A | (0 < x ^ y = x) A (x < ^ y = -x)}. 
Definition abs '{Abs A} := A x : A, ' (abs_sig x). 

However, for more complex operations, such as shifti. such an interface is differ- 
ent from the usual mathematical specification because we cannot quantify over 
all possible input values. Now there are two ways: a bundled or an unbundled 
interface. Since these interfaces are not used for hierarchies the disadvantages of 
the latter do not apply. Let us first describe the former approach. 

Class ShiftL A B '{Equiv A} '{Equiv B} '{RingOne A} 

'{RingPlus A} '{RingMult A} '{RingZero B} '{RingOne B} '{RingPlus B} := { 
shifti : A ^ B ^ A ; 

shiftLproper : Proper ((=) =^ (=) (=)) shifti ; 
shiftLO :> Rightldentity shifti ; 
shiftLS : V X n, shifti x (1 + n) = 2 * shifti x n }. 
Infix "<c; " := shifti (at level 33, left associativity). 

Although this interface seems reasonable, it does not work well in COQ. The simpi 
tactic which is used to simplify a goal will unfold occurrences of shifti to their 
underlying definition (for example in case of BigN , the expression x <c; n becomes 
BigN. shifti X n). This is rather inconvenient because COQ will then be unable to 
use lemmas concerning for rewriting. This problem is caused because shifti is a 
projection of a record, which is in fact an i-redex (reduction of pattern-matching 
over a constructed term) that will be unfolded by simpl. Currently there seems 
to be no way to adjust the behavior of simpl to remove this inconvenience. A 
similar problem was already observed in Ssreflect pS] . 

Instead we use an unbundled interface, which has a lot in common with our 
interfaces for algebraic structures. Now shifti no longer contains an i-redex. 

Class ShiftL A B := shifti: A ^ B ^ A. 

Infix "<g; " := shifti (at level 33, left associativity). 

Class ShiftLSpec A B (si : ShiftL A B) '{Equiv A} '{Equiv B} '{RingOne A} 

'{RingPlus A} '{RingMult A} '{RingZero B} '{RingOne B} '{RingPlus B} := { 
shiftLproper : Proper ((=) ^ (=) ^ (=)) (<) ; 
shiftLO :> Rightldentity (<) ; 
shiftLS :Vxn, x<(l + n) = 2*x<n}. 

We do not specify shifti as shifti x n = x * 2 " n since on the dyadics we cannot 
take a negative power while we can shift by a negative integer. 
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4.3 Decision procedures 

The Decision type class collects types with a decidable equality |18) . 
Class Decision P := decide: sumbool P P). 

Using this type class we can declare a parameter '{V x y, Decision (x < y)} to 
describe a decider for < and say decide (x < y) to decide whether x < y or not. 
This type class allows us to easily define additional deciders, like the one for the 
strict order. Wc have to be careful however. Consider the order on the dyadics. 

Global Instance dy_precedes: Order Dyadic := A (x y : Dyadic), 
ZtoQ (mant x) * 2 " (expo x) < ZtoQ (mant y) * 2 " (expo y) 

Now, decide (x < y) is actually Odecide Dyadic (x < y) dyadic_dec, where dyadic_dec 
is the computational conclusion of the decision. Due to eager evaluation, and 
the absence of dead code removal, the second argument, x < y, is also evaluated. 
Evaluation of this argument results in a conversion of x and y into Q, as described 
above. But since this argument is just a proposition it is later thrown away. We 
avoid this problem introducing a A-abstraction. 

Definition decide_rel '(R : relation A) {dec : V x y, Decision (R x y)} 
(x y : A) : Decision (R x y) := dec x y. 

Wc can now define: 

Context '{IPartialOrder (<) } {ITotalOrder (<) } '{V x y, Decision (x < y)}. 
Global Program Instance sprecedes_dec: V x y, Decision (x < y) | 9 := A x y, 

match decide_rel (<) y x with 

I left E => right _ 

i right E => left _ 

end. 



4.4 Approximate rationals 

To make our implementation of the reals independent of the underlying dense 
set, we provide an abstract specification of approximate rationals inspired by the 
notion of approximate fields which is used in the Haskell implementation of 
the exact reals by Bauer and Kavler [27| . We provide an implementation of this 
interface by dyadics based on COQ's machine integers. 

Our interface describes an ordered ring containing Z that is dense in Q. Here 
Z are the binary integers from COQ's standard library, and Q are the rationals 
based on these binary integers. We do not parametrize by arbitrary integer and 
rational implementations because they are hardly used for computation. 

Also, for efficient computation, this interface contains the operations: ap- 
proximate division, normalization, an embedding of Z, absolute value, power by 
N, shift by Z, and decision procedures for both equality and order. 

Class AppDiv AQ := app_div : AQ -i> AQ Z ^ AQ. 
Class AppApprox AQ := app_approx : AQ — > Z — > AQ. 
Class AppRationals AQ {e plus mult zero one inv} '{lOrder AQ} 
{AQtoQ ; Coerce AQ Q_as_MetricSpace} '{lApplnverse AQtoQ} 
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{ZtoAQ : Coerce Z AQ} '{lAppDiv AQ} '{lAppApprox AQ} 
■{!Abs AQ} '{IPow AQ N} '{IShiftL AQ Z} 

'{V X y : AQ, Decision (x = y)} '{V x y : AQ, Decision (x < y)} : Prop := { 
aq_ring :> ORing AQ e plus mult zero one inv ; 
aq_order_embed :> OrderEmbedding AQtoQ ; 
aq_ring_morphism :> SemiRing_Morphism AQtoQ ; 
aq_dense_embedding :> DenseEmbedding AQtoQ ; 
aq_div : V X y k, B2fc('app_div x y k) ('x / 'y) ; 
aq_approx : V x k, B2fc ('app_approx x k) ('x) ; 
aq_shift :> ShiftLSpec AQ Z (<) ; 
aq_nat_pow :> NatPowSpec AQ N (") ; 
aqJnts_mor :> SemiRing_Morphism ZtoAQ }. 

O'Connor [5] keeps the size of the rational numbers smah to avoid efHciency 
problems. He introduces a function approx x e that yields the 'simplest' ratio- 
nal number between x — e and x + e. In our interface we modify the approx 
function slightly: app_approx x k yields an arbitrary element between x — 2*^ and 
X +2*'. Using this function we define the compress operation on the real numbers; 
compress := bind (A e, app_approx x (Qdlog2 e)) such that compress x = x. 

In Section [5] we will explain our choice of using a power of 2 to specify the 
precision of app_div and app_approx. In the remainder of this section we briefly 
describe our implementation by the dyadics. 

The dyadic rationals are numbers of the shape ti * 2*^ for n, e e In or- 
der to remain independent of an integers implementation, we abstract over it. 
For our eventual implementation of the approximate rationals we use COQ's 
machine integers, bigZ. Now given an arbitrary integer implementation Int it is 
straightforward to define the dyadics. Here we will just show the ring operations. 

Notation "x \ p" := (exist _ x p) (at level 20). 
Record Dyadic := dyadic { mant : Int ; expo : Int }. 
Infix "$" := dyadic (at level 80). 

Global Instance dyjnject: Coerce Int Dyadic := A x, x $ 0. 
Global Instance dy_opp: Grouplnv Dyadic := A x, —mant x $ expo x. 
Global Instance dy_mult: RingMult Dyadic := A x y, mant x * mant y $ expo x + expo y. 
Global Instance dy_0: RingZero Dyadic := ('0:Dyadic). 
Global Instance dy_l: RingOne Dyadic := ('l:Dyadic). 
Global Program Instance dy_plus: RingPlus Dyadic := A x y, 
if decide_rel (<) (expo x) (expo y) 

then mant x + mant y <g; (expo y — expo x) f _ $ min (expo x) (expo y) 
else mant x <^ (expo x — expo y) f _ + mant y $ min (expo x) (expo y). 

In this code shifti has type Int — !> Int+— > Int, where lnt+ is a iT-type describing the 
non-negative elements of Int. Therefore, in the definition of dy_plus we have to 
equip expo y — expo x with a proof that it is in fact non-negative. 

5 Power series 

Elementary transcendental functions as exp, sin. In and arctan can be defined by 
their power series. If the coefficients of a power series are alternating, decreas- 
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ing and have limit 0, then wc obtain a fast converging sequence with an easy 
termination proof. For ~1 < x < 0, 

oo 

Ex 

is of this form. To approximate exp x with error e we take the partial sum until 
^ < e. In order to implement this efhciently we use a stream representing the 
series and define a function that sums the required number of elements. For 
example, the series 1, a, a^, a'^, ... is defined by the following stream. 

CoFixpoint powers_help (c : A) : Stream A := Cons c (powers_help (c * a)). 
Definition powers : Stream A := powers_help 1. 

Streams in COQ, like lists in Haskell, are lazy. So, in the example the multi- 
plications are accumulated. 

Since COQ only allows structural recursion (and guarded co-recursion) it re- 
quires some work to convince COQ that our algorithm terminates. Intuitively, one 
would describe the limit as an upperbound of the required number of elements 
using the Exists predicate. 

Inductive Exists A (P : Stream A — Prop) (x : Stream) : Prop := 
I Here : P x ^ Exists P x 
Further : Exists P (tl x) Exists P x. 

This approach leads to performance problems. The upperbound, encoded in 
unary format, may become very large while generally only a few terms are nec- 
essary. Due to vm_compute's eager evaluation scheme, this unary number will be 
computed before summing the series. Instead we use LazyExists [28) . 

Inductive LazyExists A (P : Stream A — >■ Prop) (x : Stream A) : Prop := 
LazyHere : P x — > LazyExists P x 
I LazyFurther : (unit — > LazyExists P (tl x)) — > LazyExists P x. 

O'Connor's InfiniteAlternatingSum s returns the real number represented by the 
infinite alternating sum over s, where the stream s is decreasing, non-negative 
and has limit 0. We have extended this in two ways. First, by generalizing some 
of the work to abstract structures. Second, as we do not have exact division on 
approximate rationals, we extended his algorithm to work with approximate divi- 
sion. The latter required changing InfiniteAlternatingSum s to InfiniteAlternatingSum 
n d which computes the infinite alternating sum of the stream Ai, This allows 
us to postpone divisions. Also, we have to determine both the length of the par- 
tial sum and the required precision of the divisions. To do so we find a k such 
that: 

Bj (app.div nu dk (log^) + ^) 0. (1) 

Now k is the length of the partial sum, and ^ is the required precision of 
division. Using O'Connor's results we have verified that these values are correct 
and such a k indeed exists for a decreasing, non-negative stream with limit 0. 
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As noted in Section [4. 4[ we have specified the precision of division in powers 
of 2 instead of using a rational value. This allows us to replace ([Ij with: 

B| (app_div uk 4 (log e - (fc + 1)) + K (log e- {k + 1))) 0. 

Here k is the length of the partial sum, and 2', where / = log e — (fc + 1), is 
the required precision of division. This variant can be implemented without any 
arithmetic on the rationals and is thus much more efficient. 

This method gives us a fast way to compute the infinite alternating sum, in 
practice, only a few extra terms have to be computed and due to the approximate 
division the auxiliary results are kept as small as possible. 

Using this method to compute infinite alternating sums we have so far imple- 
mented exp and arctan. Furthermore, we extend the exponential to its complete 
domain by repeatedly applying the following formula. 

exp X = {exp{x < 1))^ (2) 

Our tests have shown that reducing the input to a value between —2*^ < a; < 
for 50 < fc yields major performance improvements as the series will converge 
much faster. For higher precisions setting it to 75 < fc gives even better results. 
By defining arctan on [0, 1), we can define the Machin-like formula 

TT := 176 * arctan h 28 * arctan 48 * arctan h 96 * arctan . 

57 239 682 12943 

Since we do not have exact division on the approximate rationals, we see here 
the purpose of parameterizing infinite sums by two streams. 

6 Square root 

We use Wolfram's algorithm p. 913] for computing the square root. Its com- 
plexity is linear, in fact it provides a new binary digit in each step. We aim to 
investigate Newton iteration in future work. 

Context '(Pa : 1 < a < 4). 
Fixpoint AQroot_loop (n : nat) : AQ * AQ := 
match n with 
I ^ (a, 0) 
I S n ^ 

let (r, s) := AQrootJoop n in 
if decide.rel (<) (s + 1) r 

then ((r - (s + 1)) < (2:Z), (s + 2) < (1:Z)) 
else (r < (2:Z), s < (1:Z)) 
end. 

Three easy invariants allow us to prove this series converges to the square root. 
Lemma AQrootJoopJnvariantl (n : nat) : 

snd (AQrootJoop n) * snd (AQroot_loop n) + 4 * fst (AQrootJoop n) = 4*4"n*a. 
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Expression 


Decimals 


O'Connor 


Krebbers / Spitters 


sin (sin (sin 1)) 


10,000 


71s 


5s 


cos (10^°) 


10,000 


2.7s 


0.6s 


tan (\/2) + arctanh (sin 1) 


500 


133s 


2.2s 



Table 1. Haskell, compiled with ghc version 6.12.1, using -02. 



Expression 


Decimals 


O'Connor 


Krebbers/Spitters 


TT 


300 


55s 


0.8s 


exp (exp (exp (i))) 


25 


123s 


0.23s 


exp TT — TT 


25 


52s 


0.1s 


arctan n 


25 


134s 


1.0s 



Table 2. COQ trunk revision 13841. 



Lemma AQrootJoopJnvariant2 (n : nat) : 

fst (AQrootJoop n) < 2 * snd (AQroot.loop n) + 4. 
Lemma AQrootJoop_fst_bound (n : nat) : 

fst (AQrootJoop n) < 2 (3 + n). 

7 Benchmarks 

The first step in this research was to create a Haskell prototype based on 
O'Connor's implementation of the real numbers in Haskell [5]. The second 
step was to implement this prototype in COQ. Currently, our COQ development 
contains the field operations, computation of power series, exp, arctan, tt and the 
square root. Apart from the square root, the correctness of these operations has 
been verified in the COQ system. 

In this section we present some benchmarks comparing the old and the 
new implementation, both in Haskell and COQ. All benchmarks have been 
carried out on an Intel Core Quad 2.4 GHz with 8GB of memory running 
Debian gnu/Linux with kernel 2.6.32. The sources of our developments can 
be found at http://robbertkrebbers.nl/research/reals. 

Table [T] shows some benchmarks in Haskell with compiler optimizations 
enabled (-02) and Table [2] compares our COQ implementation with O'Connor's. 
More extensive benchmarking shows that our Haskell implementation gener- 
ally benefits from a 15 times speed up while the speed up in COQ is usually 
more than a 100 times. This difference is explained by the fact that O'Connor's 
Haskell implementation already used fast integers, while his COQ implementa- 
tion did not. In the same times as shown in Table [5] for the old implementation, 
the new implementation is able to compute the first 2,000 decimals of tt, 450 
decimals of exp (exp (exp (^))), 425 decimals of exp tt — tt and 85 decimals of 
arctan tt. This is an improvement of up to 18 times of the number of decimals. 

It is interesting to notice that tt and arctan benefit the least from our im- 
provements, as we are unaware of an optimization similar to the squaring trick 
for exp (Section [5j Equation (2). 
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We conclude this section with a comparison between the performance of 
Wolfram's algorithm in COQ and Haskell. The Haskell prototype (without 
compiler optimizations) is quite fast, computing 10,000 iterations (giving 3,010 
decimals) of \/2 takes 0.2s. In COQ it takes 11.6s using type classes and 11.3s 
without type classes. Here we exclude the time spend on type class resolution. 
Thus type classes cause only a 3% performance penalty on computations. 

Unfortunately, the COQ-implementation is slow compared to Haskell. Lau- 
rent Thery suggested that this is due to the representation of the fast integers, 
which uses a tree with a fixed depth and when the size of the integer becomes 
too big uses a less optimal representation. Increasing the size of the tree rep- 
resentation and avoiding an inefhciency in the implementation of shifts reduces 
this time to 7.5s. 

8 Conclusions and Related work 

We have greatly improved the performance of real number computation in COQ 
using Coq's new machine integers. We produced highly structured and abstract 
code using type classes with no apparent performance penalty. Moreover, COQ's 
notation mechanism combined with Unicode characters gives nicely readable 
statements and proofs. Type classes were a great help in our work. However, 
the current implementation of instance resolution is still experimental and at 
times too slow (at compile time). 

Canonical structures provide an alternative, and partially complementary, 
implementation of type classes |30j . By choice, canonical structures restrict to de- 
terministic proof search, this makes them more efficient, but also somewhat more 
intricate to use. The use of canonical structures by the Ssreflect team [5S] 
makes it plausible that with some effort we could have used canonical structures 
for our work instead. However, the SSREFLECT-library is currently not suited 
for setoids which are crucial to us. The integration of unification hints |31| into 
COQ may allow a tighter integration of type classes and canonical structures. 

We needed to adapt our correctness proofs to prevent the virtual machine 
from eagerly evaluating them. Lazy evaluation for Prop would have allowed us 
to use the original proofs. 

The experimental native_compute performs evaluation by compilation to native 
OCAML code. This approach uses the OCAML compiler available and is interest- 
ing for heavy compilation. Our first experiments indicate a 10 times speed up 
with Wolfram iteration. Unfortunately, native_compute does not work with COQ 
trunk yet, so we were unable to test it with our implementation of the reals. 

The Flocq project [32] formalizes floating-points in COQ. It provides a li- 
brary of theorems on a multi-radix multi-precision arithmetic and supports ef- 
ficient numerical computations inside COQ. However, the current library is still 
too limited for our purposes, but in the future it should be possible to show that 
they form an instance of our approximate rationals. This may allow us to gain 
some speed by taking advantage of fine grained algorithms on the floats instead 
of our more straightforward ones. 
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The encoding of real numbers as streams of 'bits' is potentially interesting. 
However, currently there is a big difference in performance. The computation 
of 37 decimals of the square root of 1/2 by Newton iteration [33], using the 
framework described in |34l35j . took 12s. This should be compared with our use 
of the Wolfram iteration, which gives only linear convergence, but with which 
we nevertheless obtain 3,000 decimals in in a similar time. On the other hand, 
the efhciency of tt in their framework is comparable with ours. Berger |36| . too, 
uses co-induction for exact real computation. 

The present work is part of a larger program to use constructive mathematics 
based on type theory as a programming language for exact analysis. This should 
culminate in a numerical ODE-solver. 
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